Fuzzy Clustering of Parallel Data Streams

Authors

  • Jürgen Beringer
  • Eyke Hüllermeier
Abstract

The management and processing of so-called data streams has recently become a topic of active research in several fields of computer science, notably database systems and data mining. A data stream can roughly be thought of as a transient, continuously increasing sequence of time-stamped data. In this paper, we consider the problem of clustering parallel streams of real-valued data, that is, continuously evolving time series. More specifically, we are interested in grouping data streams whose evolution over time is similar in a specific sense. To maintain an up-to-date clustering structure, the incoming data must be analyzed online, tolerating no more than a constant time delay. For this purpose, we develop an efficient online version of the fuzzy C-means clustering algorithm. A fuzzy approach appears particularly useful for this type of application, in which the clustering structure is subject to continuous change.
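For orientation, the following is a minimal sketch of the standard batch fuzzy C-means algorithm that the abstract names as its starting point. It is not the authors' online version, whose details are not given here. The function name `fuzzy_c_means`, its parameters, and the choice of squared Euclidean distance are assumptions made for illustration; each stream would first have to be represented as a fixed-length numeric vector (for example, a window of recent values).

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard batch fuzzy C-means (illustrative sketch, not the paper's online variant).

    points: list of equal-length numeric tuples; c: number of clusters;
    m > 1: fuzzifier. Returns (centers, memberships); each membership row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of all points weighted by membership ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )

        # Membership update from squared Euclidean distances to the centers.
        shift = 0.0
        for i in range(n):
            d2 = [
                sum((points[i][d] - centers[k][d]) ** 2 for d in range(dim))
                for k in range(c)
            ]
            zeros = [k for k in range(c) if d2[k] == 0.0]
            if zeros:
                # The point coincides with one or more centers: share full membership.
                new = [1.0 / len(zeros) if k in zeros else 0.0 for k in range(c)]
            else:
                inv = [dk ** (-1.0 / (m - 1.0)) for dk in d2]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, memberships = fuzzy_c_means(data, c=2)
    for point, row in zip(data, memberships):
        print(point, [round(v, 3) for v in row])
```

Because every point receives a graded membership in each cluster rather than a hard label, a stream that drifts from one group toward another shows up as a gradual change in its membership row, which is the property the abstract points to when it calls a fuzzy approach useful for continuously changing cluster structure.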

Similar resources

Online-Data-Mining auf Datenströmen: Methoden zur Clusteranalyse und Klassifikation

• J. Beringer and E. Hüllermeier. Efficient instance based learning on data streams. Adaptive optimization of the number of clusters in fuzzy clustering. Fuzzy clustering of parallel data streams.

High Performance Implementation of Fuzzy C-Means and Watershed Algorithms for MRI Segmentation

Image segmentation is one of the most common steps in digital image processing. Many image segmentation algorithms (e.g., thresholding, edge detection, and region growing) are employed for classifying a digital image into different segments. In this connection, finding a suitable algorithm for medical image segmentation is a challenging task, mainly due to the noise, low contrast, and steep...

Incorporation of Non-euclidean Distance Metrics into Fuzzy Clustering on Graphics Processing Units

Computational tractability of clustering algorithms becomes a problem as the number of data points, feature dimensionality, and number of clusters increase. Graphics Processing Units (GPUs) are low cost, high performance stream processing architectures used currently by the gaming, movie, and computer aided design industries. Fuzzy clustering is a pattern recognition algorithm that has a great ...

Online fuzzy medoid based clustering algorithms

This paper describes two new online fuzzy clustering algorithms based on medoids. These algorithms have been developed to deal with either very large datasets that do not fit in main memory or data streams in which data are produced continuously. The innovative aspect of our approach is the combination of fuzzy methods, which are well adapted to outliers and overlapping clusters, with medoids a...

Publication date: 2008